% chap1.tex {Introductory Chapter}

\chapter{Introduction}

\section{Data Processing}

Processing engineering and scientific data is one of the main functions of 
computing.  This processing is needed if we are interested in the value of 
some physical quantity $y$, and it is difficult or impossible to
measure this quantity directly. To find an estimate $\tilde y$ of such a
quantity $y$, we measure other quantities $x_1,\ldots,x_n$ that are related
to $y$ in a known way, and then compute $\tilde y$ from the results $\tilde
x_1,\ldots,\tilde x_n$ of measuring $x_1,\ldots,x_n$. As an example, it
is impossible to directly measure the amount of oil in a well; thus,
geophysicists measure ultrasound conductivity and other measurable
parameters, and then estimate the amount of oil based on the values of
these measurements. 

In such situations, we have an algorithm ${\cal U}_f$ specifying a function $f$ 
that transforms the values of directly measured quantities $x_1,\ldots,x_n$ 
into the value of the desired quantity $y=f(x_1,\ldots,x_n)$.
To estimate the value of the desired quantity, we apply this algorithm 
${\cal U}_f$ to the measurement results $\tilde x_1,\ldots,\tilde x_n$ and 
compute the estimate $\tilde y=f(\tilde x_1,\ldots,\tilde x_n)$. 

\section{Estimating the Accuracy of the Results of Data Processing}

In many cases, the algorithm specifying a function $f$ used in data
processing is {\it precise} in the sense that in the ideal situation
where we use the exact (actual) values of $x_i$ as input, the algorithm
returns the exact value of the desired quantity $y$.

In reality, however, measurements are almost never absolutely exact; i.e.,
the results $\tilde x_i$ of direct measurements are (in general) somewhat
different from the actual values $x_i$.  Because of these {\em measurement
errors\/} $\Delta_{x_i}=\tilde x_i-x_i$, the result
$f(\tilde x_1,\ldots,\tilde x_n)$ of data processing is, in general,
different from the actual value $y=f(x_1,\ldots,x_n)$ of the desired quantity.

It is clear that we need to know how accurate the results of data processing
are in order to make a qualitative decision about these results.  In other
words, we need to know what the possible values of the error
$\Delta_y=\tilde y-y$ are as a result of data processing.
For example, whether we decide to drill for oil depends on the estimate of
the amount of oil in the area {\em and\/} on the accuracy of this estimate.
Suppose we require a minimum of 50 million tons of oil to make drilling
profitable.  If we compute an estimate of 100 million tons of oil in an
area by processing data obtained from preliminary measurements, should we 
decide to start drilling?  If this estimate is very inaccurate (e.g., an
absolute accuracy of 100 million tons would mean that the actual value could
be any value from 0 to 200 million tons), then further measurements are
warranted if they will lead to a more accurate estimate.

The desire to solve a particular case of the general problem of determining
the accuracy of such indirect measurements is the motivation for the work in
this thesis.

\section{Traditionally, Statistical Methods Are Used}

Since the error in the computed value of $y$ comes from measurement
errors, we must use some information about the possible values of the
measurement errors in order to determine the maximum possible error (or the
likelihood of different possible values of the error) in the computed value
of $y$.

In some cases, for each measuring device we know not only the values of the
measurement errors that are possible, but also the {\em probabilities\/} of
the different possible values. In such cases, we can use statistical methods
to estimate the statistical characteristics of the error in the computed value
of $y$. These are the methods traditionally used by
engineers and scientists (see,~e.g.,~\cite{Rabinovich1995}).

\section{Statistical Methods Are Not Always Applicable}

In spite of the success in the use of statistical methods for some special
cases, in many real-life situations the probabilities of the different error
measurement values are not known. These are common situations in manufacturing
and robotics, for example, where various sensors are used. In such situations, 
the only information we may have is the set of possible measurement error 
values; thus, the traditional statistical methods are not applicable.
For a typical example, many measuring instruments used in industry have a
manufacturer-provided {\em guaranteed\/} absolute accuracy, i.e., a number
$\Delta^{max}>0$ that the absolute value of the measurement error cannot
exceed.

Consequently, if a measured value is $\tilde x$ and the guaranteed absolute
accuracy of the measuring device is $\Delta^{max}_x$, then the actual value
$x$ is guaranteed to lie in the {\em interval\/} $[\tilde x-\Delta^{max}_x,
\tilde x+\Delta^{max}_x]$.
In the following text, we will denote this interval as $[x^-,x^+]$ or simply
as ${\bf x}$. As an example, if a thermometer has an accuracy of
$\Delta^{max}_T=2^{\circ}F$, and it reads $\tilde T=40^{\circ}F$, the actual
value $T$ of the temperature lies between $\tilde T-\Delta^{max}_T=
38^{\circ}F$ and $\tilde T+\Delta^{max}_T=42^{\circ}F$, i.e.,
$T\in[38^{\circ}F,42^{\circ}F]$.

\section{Interval Computations}

In such situations, since we only know the {\em intervals\/}
$[x^-_i,x^+_i]$ that
contain the input data $x_i$, we cannot compute the exact value of
$y=f(x_1,\ldots,x_n)$; we can only compute the {\it interval}
$[y^-,y^+]=\{f(x_1,\ldots,x_n)|x_1\in[x_1^-,x_1^+],
\ldots,x_n\in[x_n^-,x_n^+]\}$
of possible values of the desired quantity
$y$.  This interval is usually denoted by
$f({\bf x}_1,\ldots,{\bf x}_n)$.

Computing the bounds $y^-$ and $y^+$ is an example of {\em interval
computations}.  The history of interval computations is usually said to have
started with the pioneer work of R.~E.~Moore.  In 1959, while working for
Lockheed Space and Missile Corporation, Moore wrote a technical report
containing a description of what we now call interval computations
(see~\cite{Moore1959}).  For a survey of the modern state of interval
computations, see, e.g.,~\cite{Kearfott1996}.
\nocite{Kreinovich1996b}
\nocite{Kulisch1981}
\nocite{Moore1966}

\section{A Typical Data Processing Problem: Solving Linear Systems}

Occasionally, there are known explicit formulas for describing a desired value
$y$ directly in terms of measurable parameters $x_1,\ldots,x_n$. For example,
if we use Ohm's law $V=I\cdot R$, we can compute the voltage ($y=V$) by simply
measuring the current ($x_1=I$), measuring the resistance ($x_2=R$), and
applying the simple formula $y=x_1\cdot x_2$.

In most cases, however, the known dependency between $x_i$ and $y$ is
{\it implicit}: we have some equations that connect $x_i$ and $y$, but
we do not have exact formulas that describe $y$ in terms of $x_i$.  These
equations typically include, in addition to $x_i$ and $y$, some other
quantities that we may not be
directly interested in, but that we have to compute if we want to find
$y$. For example, in the geophysical problem, we may have to determine
certain geophysical parameters like the density of the rocks at different
depths before we are able to estimate the amount of oil.

Thus, we may have several unknown quantities $y_1,\ldots,y_q$
that are related to the directly measured quantities
$x_1,\ldots,x_n$. Usually, each measurement $x_i$ leads to a new restriction
on the values on the unknown quantities, i.e., to a new equation
$F_i(y_1,\ldots,y_q,x_i)=0$ that relates the unknown values $y_j$ and
the measurement result $x_i$. Therefore, we have (in general) a system of
non-linear equations from which we must determine the values $y_j$.

Often, we know the approximate values $y^{(0)}_j$ of the unknown
quantities $y_j$. In such situations, we need only compute differences
$\Delta_{y_j}=y_j-y^{(0)}_j$.  In terms of
these differences, the equations take the form
$$F_i(y^{(0)}_1+\Delta_{y_1},\ldots,y^{(0)}_q+\Delta_{y_q},\tilde x_i)=0.$$
Since the differences are small, if each function $F_i$ is smooth and
continuous then we can expand every function $F_i$ into Taylor series,
neglect terms that are quadratic or of higher order in $\Delta_{y_j}$, and
get a system of equations that are {\em linear\/} in terms of the unknowns.
In such cases, we must solve a system of linear equations in order to
find (close approximations for) the unknown values.

Thus, we see that solving systems of linear equations is a typical example of
data processing. In this thesis, we will be analyzing the problem of
error estimation for such data processing algorithms.

In order to avoid possible confusion, please note
that in the previous text, we followed the standard denotation of
measurement analysis and denoted by $x_i$ the
values of the directly measured quantities.
For linear equations, $x$ usually denotes an
unknown, i.e., a desired result of data processing. In the following
text, we will only be considering systems of linear equations, and thus, 
$x$ will denote an unknown in a system of linear equations hereafter. In 
these denotations, a general system of linear equations will take the 
following form: 
\begin{equation}
  \sum_{j=1}^n a_{ij}x_j=b_i.                      \label{eq:sys-lin-eq}
\end{equation}

\section{Interval Linear Systems}

The coefficients of linear equations (i.e., $a_{ij}$ and $b_i$) often come 
as results of measurements.  Let us assume that we only know the interval
of possible values associated with each measurement result; i.e., traditional
statistical methods are not applicable. From this we conclude that
\begin{itemize}
\item we only know the intervals ${\bf b}_i$ that contain the actual values
      of $b_i$, and
\item we only know the intervals ${\bf a}_{ij}$ that contain the actual
      values of $a_{ij}$.
\end{itemize}
Thus, we have a system of linear equations with interval coefficients.
We call such a system a {\em system of interval linear equations}.

Since the only thing we know about the actual
values of the measured quantities
is that they lie within some intervals ${\bf a}_{ij}$ and ${\bf b}_i$,
we can rewrite system (\ref{eq:sys-lin-eq}) as
\begin{equation}
  \sum_{j=1}^n {\bf a}_{ij}x_j={\bf b}_i.         \label{eq:sys-int-lin-eq}
\end{equation}
We say that a tuple $(x_1,\ldots,x_n)$ is a {\em solution\/} to this system
(\ref{eq:sys-int-lin-eq}) of interval linear equations if for some
$a_{ij}\in {\bf a}_{ij}$ and $b_i\in {\bf b}_i$
$$
  \sum_{j=1}^n a_{ij}x_j=b_i
$$
for all $i=1,\ldots,m$.  The set of all solutions to system
(\ref{eq:sys-int-lin-eq}) is usually denoted by $X$.

Note that for different solutions $(x_1,\ldots,x_n)\in X$ we may get different
values of $x_j$; therefore, since we cannot find a unique value for $x_j$, we
need to find the {\em interval of possible values of $x_j$} for all
$j=1,\ldots,n$.  In other words, we need to find the interval
${\bf x}_j=[x^-_j,x^+_j]$ where
$$
  x^-_j=\min\{x_j\,|\,(x_1,\ldots,x_n)\in X\}
$$
and
$$
  x^+_j=\max\{x_j\,|\,(x_1,\ldots,x_n)\in X\}
$$
for all $j=1,\ldots,n$.
By {\em solving\/} a system of interval linear equations, we mean
computing the bounds $x_j^-$ and $x_j^+$ of the interval ${\bf x}_j$
for all $j=1,\ldots,n$.

\section{Example}

Let us take a simple system of interval linear equations
$$
  {\bf a}_{11}x_1={\bf b}_1
$$
with one unknown ($n=1$) and one measurement ($m=1$), for which:
\begin{itemize}
\item as a
result of measuring $b_1$, we get $\tilde b_1=2.5$;
\item as a result of measuring $a_{11}$, we get $\tilde a_{11}=1.5$;
\item the absolute accuracy of both measurements is
      $\Delta^{max}_a=\Delta^{max}_b=0.5$.
\end{itemize}
This means that ${\bf b}_1=[2.5-0.5,2.5+0.5]=[2,3]$ and
${\bf a}_{11}=[1.5-0.5,1.5+0.5]=[1,2]$.
Therefore, the system (\ref{eq:sys-int-lin-eq}) reduces to the single interval
linear equation
$$
  [1,2]x_1=[2,3].
$$
To solve this system, we need to find the solution set $X$ to this problem; 
i.e., we need to find the endpoints $x_1^-$ and $x_1^+$ of the interval
${\bf x}_1$ such that $x_1\in{\bf x}_1$.  Here $x_1=b_1/a_{11}$ where
$a_{11}\in{\bf a}_{11}$ and $b_1\in{\bf b}_1$.  The value of $x_1$ is smallest
when $b_1$ takes the smallest possible value in ${\bf b}_1$ and $a_{11}$ takes
the largest possible value in ${\bf a}_{11}$.  Similarly, the value of $x_1$
is largest when $b_1$ takes the largest possible value in ${\bf b}_1$ and
$a_{11}$ takes the smallest possible value in ${\bf a}_{11}$.  Hence,
$$
  x_1^-=\frac{\min({\bf b}_1)}{\max({\bf a}_{11})}
       =\frac{\min([2,3])}{\max([1,2])}=\frac{2}{2}=1
$$
and
$$
  x_1^+=\frac{\max({\bf b}_1)}{\min({\bf a}_{11})}
       =\frac{\max([2,3])}{\min([1,2])}=\frac{3}{1}=3.
$$
Thus, ${\bf x}_1=[1,3]$ and $X=\{(x_1)|x_1\in[1,3]\}$.

\section{Solving Interval Linear Systems Is NP-Hard}

There exist many useful algorithms for solving interval linear
systems (see,~e.g.,~\cite{Neumaier1990}); however, all these algorithms
face the following problem: there are systems of linear interval
equations of size $n\times m$, for which the running time grows exponentially
with $n$.  As a result, even for reasonably small $n$, the running time can
be longer than the estimated lifetime of the universe!

It was later shown that this problem is not caused by the imperfection
of the algorithms, but by the computational complexity of the problem
itself; i.e, it was shown (see~\cite{Kreinovich1993}) that this problem
is {\em computationally intractable\/} (i.e., NP-{\em hard\/})
(see~\cite{Garey1979}). Crudely speaking, this means that unless {\em all\/}
possible problems can be solved in reasonable
time\footnote{This highly improbable hypothesis is usually denoted as 
``P=NP.''}, any algorithm for solving this problem must go exponential in some
case.  We will look at this in more detail in later chapters.

\section{What If Intervals Are Narrow?}

Most measuring devices used in industry are reasonably accurate, and thus,
the intervals associated with measurements are almost always narrow.  It would
be nice if we could write a computer program that could solve interval linear
systems with narrow intervals since this appears to be the sort of problem
typically seen in data processing.
If we then restrict ourselves to narrow intervals only, will the problem
of solving interval linear systems still be computationally intractable?

Of particular relevance to this question is the fact that for narrow
intervals there exists an algorithm that solves {\em almost all\/} interval
linear systems in reasonable time (see~\cite{Kreinovich1996a,Lakeyev1995}).  
We will look at this in more detail in Chapter~\ref{MostNarrowEasy}.

\section{Our Result}

In spite of the above optimistic result, we will show that {\em the problem
of solving interval linear systems with narrow intervals is computationally
intractable (i.e., \/{\rm NP}-hard)}.  In other words, (unless P=NP) there
exists no algorithm that can solve every narrow-interval linear system of
reasonable size in a reasonable amount of time.  Thus, it is believed that no
one will ever be able to write a computer program that can solve
narrow-interval linear systems in general, no matter how fast computers of
the future become.  We will look at this problem in more detail in later
chapters.  The proof of this result can be found in Chapter~\ref{NewResult}.
